ABSTRACT

Equating tests from different calibrations under item 
response theory (IRT) requires calculation of the slope and intercept 
of the appropriate linear transformation. Two methods have been 
proposed recently for equating graded response items under IRT, a 
test characteristic curve method and a minimum chi-square method. 
These two methods are compared with three mean and sigma methods
using computer simulations. Ten- and 30-item tests were simulated for
300 and 1,000 examinees. Results under these simulated conditions 
indicate that recovery is good for all conditions. Recovery is 
slightly better for the long test and the large sample, but 
differences among all simulated conditions are quite small. 
Essentially no differences are observed among the linking methods. 
One could feel relatively comfortable using any of the five equating 
methods when ability and item location distributions are 
well-matched. The simplest equating method is the B. H. Loyd and H.
D. Hoover (LH) mean and sigma method (1980). The minimum chi-square 
has some advantage in ease of use over the test characteristic curve 
method, but both are more complicated than the LH method. Eight 
tables present analysis results. (SLD) 
Equating in the Graded Response Model
Abstract 

Equating tests from different calibrations under Item Response Theory (IRT) requires
calculation of the slope and intercept of the appropriate linear transformation.
Two methods have been proposed recently for equating graded response items under
IRT: a test characteristic curve method and a minimum chi-square method. In
the present study, we compare these two methods with three mean and sigma
methods using simulated data sets.

Index terms: equating, graded response model, item response theory.
A Comparison of Equating Methods
Under the Graded Response Model 

The metrics yielded by current item response theory (IRT) estimation algorithms 
from separate calibrations for the same items are unique up to a linear transformation. 
This means that, to equate tests which have been calibrated separately, it is
necessary to determine the slope and intercept of the linear equation which yields
the appropriate transformation. In the present paper, we compare results from five 
methods for determining these two transformation coefficients for Samejima's (1969) 
graded response model. 

Three general classes of equating methods have been described for the dichotomous 
IRT model: characteristic curve methods, minimum chi-square methods, and mean 
and sigma methods. Characteristic curve methods (cf. Divgi, 1980; Haebara, 1980; 
Stocking & Lord, 1983) make use of the information available from both the item
discrimination and item difficulty parameters. This class of methods is specifically
designed to obtain the slope and intercept coefficients by minimizing some measure 
of the difference between the test characteristic curves estimated in each sample. The 
Stocking and Lord procedure obtains the two equating coefficients by minimizing a 
quadratic loss function based on differences in true scores yielded by the two test
calibrations. Baker (1992) extended this procedure to the graded response model. 
The minimum chi-square method proposed by Divgi (1985) for dichotomously scored 
items, which uses estimates of both item discrimination and difficulty parameters as 
well as their standard errors, is computationally simpler than the Stocking and Lord 
procedure. Kim and Cohen (in press) have extended the minimum chi-square method 
to the graded response model. 

Several methods, generally known as mean and sigma methods, have been 
proposed that rely on the distributions of item difficulty and discrimination estimates 
(cf. Bejar & Wingersky, 1981; Cook, Eignor, & Hutton, 1979; Linn, Levine, Hastings,
& Wardrop, 1980, 1981; Marco, 1977; Vale, 1986). Mean and sigma methods are
presently only described for the dichotomous model. Comparisons of linking results
for dichotomous items suggest that for large samples and long tests, few differences
exist among the weighted mean and sigma method of Linn et al. (1980, 1981), the
method by Stocking and Lord (1983), and Divgi's (1985) minimum chi-square method
(Kim & Cohen, 1992). In the present study, we describe three variations of mean and 
sigma methods for Samejima's (1969) graded response model and compare the results 
to those obtained using the methods by Baker (1992) and Kim and Cohen (in press). 

Equating Under IRT. Lord (1980) has shown that, under IRT, the relationship 
of the metric between any two calibrations of the same items from different groups in 
the same population is linear. Thus, when the estimates from the second calibration 
are to be transformed to the metric of the first, the transformed estimates of item
discrimination and item difficulty parameters of item j for the dichotomously scored, 
two-parameter IRT model are given by 

$$a^{*}_{j2} = a_{j2} / A \tag{1}$$

and

$$b^{*}_{j2} = A b_{j2} + B, \tag{2}$$

where * indicates a transformed value, the subscript 2 refers to the calibration from 
the second group, A is the slope coefficient, and B is the intercept coefficient. 
The value of the transformed ability estimate of person i can be expressed as 

$$\theta^{*}_{i2} = A \theta_{i2} + B. \tag{3}$$

The task of equating the two metrics is to find the appropriate equating coefficients
A and B.
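
The linear relationship in Equations 1 to 3 can be illustrated with a minimal Python sketch. The function names and the numerical values of A, B, and the estimates are hypothetical and chosen only for illustration; the sketch simply checks that the two-parameter logistic response probability is unchanged when item and ability estimates are transformed together.

```python
import math

def transform_item(a2, b2, A, B):
    """Place second-calibration 2PL item estimates on the first metric (Eqs. 1 and 2)."""
    return a2 / A, A * b2 + B

def transform_ability(theta2, A, B):
    """Place a second-calibration ability estimate on the first metric (Eq. 3)."""
    return A * theta2 + B

def p_2pl(theta, a, b):
    """Two-parameter logistic probability of a correct response."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

# Hypothetical coefficients and estimates, chosen only for illustration.
A, B = 1.25, -0.40
a2, b2, theta2 = 1.5, 0.2, 0.8

a_star, b_star = transform_item(a2, b2, A, B)
theta_star = transform_ability(theta2, A, B)

# a*(theta* - b*) = (a/A)(A*theta + B - A*b - B) = a(theta - b), so the
# response probability is the same before and after the transformation.
p_before = p_2pl(theta2, a2, b2)
p_after = p_2pl(theta_star, a_star, b_star)
print(p_before, p_after)
```

Because the logit a(theta - b) is invariant, any positive A and any B describe an equally valid metric, which is why the coefficients must be estimated from the common items.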
Many different equating situations exist (cf. Vale, 1986). In this paper, we consider
only that situation for which a set of common items is administered to two groups of 
examinees. 

Samejima's Graded Response Model. Under Samejima's graded response
model (Samejima, 1969), an item possesses $m_j$ ordered categories and the examinee
is permitted to select only one. Item parameters are estimated under the graded
response model via the use of the $m_j - 1$ boundary characteristic curves (BCCs). Each
of the BCCs represents the cumulative probability of selecting response categories
greater than the category of interest (Samejima, 1969). The BCCs for item j
are characterized by an item discrimination parameter $a_j$ and the $m_j - 1$ location
parameters $b_{jk}$.

The BCC in logistic form can be defined as 

$$P^{*}_{jk}(\theta_i) = [1 + \exp\{-a_j(\theta_i - b_{jk})\}]^{-1}. \tag{4}$$

In the case of the metric from the second group being equated to that of the first,
the transformation for the graded response model can be obtained via

$$a^{*}_{j2} = a_{j2} / A \tag{5}$$

and

$$b^{*}_{jk2} = A b_{jk2} + B. \tag{6}$$

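Samejima (1969) defines the operational characteristic curve (OCC) which shows
the probability of selecting a category. The OCC can be obtained from the boundary
curves as

$$P_{jk}(\theta_i) = \begin{cases} 1 - P^{*}_{j1}(\theta_i) & \text{when } k = 1, \\ P^{*}_{j(m_j-1)}(\theta_i) & \text{when } k = m_j, \\ P^{*}_{j(k-1)}(\theta_i) - P^{*}_{jk}(\theta_i) & \text{otherwise.} \end{cases} \tag{7}$$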
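
As a minimal sketch of how category probabilities follow from the boundary curves, the Python fragment below computes the BCCs of Equation 4 and differences adjacent curves to obtain the OCCs. It assumes that the boundary for "above category 0" is 1 and that for "above category m_j" is 0, so the first category probability is one minus the first BCC; the function names and the item parameters are hypothetical.

```python
import math

def bcc(theta, a, b):
    """Boundary characteristic curve: probability of responding above a boundary (Eq. 4)."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def occ(theta, a, bs):
    """Category probabilities for one graded item with len(bs) + 1 ordered categories.

    bs must be in increasing order so that the boundary curves are ordered.
    """
    # Pad with 1 (above "category 0") and 0 (above the last category).
    stars = [1.0] + [bcc(theta, a, b) for b in bs] + [0.0]
    return [stars[k] - stars[k + 1] for k in range(len(bs) + 1)]

# Hypothetical five-category item (four boundaries), for illustration only.
probs = occ(0.0, 1.5, [-1.0, -0.3, 0.4, 1.2])
print(probs, sum(probs))
```

The differencing guarantees that the category probabilities sum to one at every ability level, provided the location parameters are ordered.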
Equating Methods for the Graded Response Model

Mean and Sigma Equating Methods for the Graded Response Model. 
Three mean and sigma methods are described in this section for the graded response 
model: a mean and sigma method (MS) (Marco, 1977), the Loyd and Hoover (1980) 
method (LH), and a weighted mean and sigma method (WMS) by Linn et al. (1980, 
1981). 

For the MS method, it is assumed that the location parameters from the first group,
$b_{jk1}$ $(j = 1, \ldots, n; k = 1, \ldots, m_j - 1)$, and from the second group, $b_{jk2}$, are linearly
related as

$$b_{jk1} = A b_{jk2} + B \tag{8}$$

and hence,

$$a_{j1} = a_{j2} / A. \tag{9}$$

The MS equating coefficients A and B are obtained from the following relationships:

$$\bar{b}_1 = A \bar{b}_2 + B, \tag{10}$$

where $\bar{b}_1$ and $\bar{b}_2$ are the means of the $b_{jk1}$s and $b_{jk2}$s, respectively, and

$$S(b_1) = A\,S(b_2), \tag{11}$$

where $S(b_1)$ and $S(b_2)$ are the standard deviations of the $b_{jk1}$s and $b_{jk2}$s, respectively.
Thus, we can obtain A and B as

$$A = S(b_1) / S(b_2) \tag{12}$$

and

$$B = \bar{b}_1 - A \bar{b}_2. \tag{13}$$
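
The MS computation can be sketched in a few lines of Python. The function name and the location values are hypothetical; the second-group values are constructed as an exact linear function of the first-group values so that the known coefficients are recovered.

```python
from statistics import mean, pstdev

def mean_sigma(b1, b2):
    """Mean and sigma coefficients from pooled location estimates of two calibrations."""
    A = pstdev(b1) / pstdev(b2)        # ratio of standard deviations
    B = mean(b1) - A * mean(b2)        # difference of means after rescaling
    return A, B

# Hypothetical location estimates from the second calibration.
b2 = [-1.4, -0.6, 0.1, 0.9, -0.8, 0.0, 0.7, 1.5]
# First-calibration values built with known coefficients A = 1.2, B = 0.3.
b1 = [1.2 * b + 0.3 for b in b2]

A, B = mean_sigma(b1, b2)
print(A, B)
```

Population or sample standard deviations give the same A because the divisor cancels in the ratio.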
Loyd and Hoover (1980) used the ratio of item discrimination parameter estimates
from the two calibrations to obtain the A coefficient for their LH method. The LH
method was originally used under the Rasch model. Baker and Al-Karni (1991) used
the LH method to equate metrics for the three-parameter IRT model. Since we know
$a_{j1} = a_{j2} / A$, the A coefficient for the LH method for the graded response model can
be obtained as

$$A = \bar{a}_2 / \bar{a}_1 \tag{14}$$

and the B coefficient as

$$B = \bar{b}_1 - A \bar{b}_2, \tag{15}$$

where $\bar{a}_2$ and $\bar{a}_1$ are the means of the $a_{j2}$s and $a_{j1}$s, respectively.
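
A minimal Python sketch of the LH computation follows. The function name and all parameter values are hypothetical, and the first-calibration values are generated from the second-calibration values with known coefficients so that recovery can be checked.

```python
from statistics import mean

def loyd_hoover(a1, b1, a2, b2):
    """LH coefficients: slope from mean discriminations, intercept from mean locations."""
    A = mean(a2) / mean(a1)
    B = mean(b1) - A * mean(b2)
    return A, B

# Hypothetical second-calibration estimates (three items, locations pooled over categories).
a2 = [1.2, 1.8, 1.5]
b2 = [-1.0, -0.2, 0.5, 1.3, -0.7, 0.1, 0.8, 1.6, -1.2, -0.4, 0.3, 1.0]

# First-calibration values implied by known coefficients A = 0.9, B = -0.25.
true_A, true_B = 0.9, -0.25
a1 = [a / true_A for a in a2]
b1 = [true_A * b + true_B for b in b2]

A, B = loyd_hoover(a1, b1, a2, b2)
print(A, B)
```

Only two means are needed for the slope, which is what makes this the simplest of the five methods to compute.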

An important problem with the MS and LH methods is that poorly estimated item
difficulties can have a detrimental effect on the values of the A and B coefficients.
To overcome this problem, Linn et al. (1980, 1981) modified the MS procedure to
include a weighting of item difficulty estimates by the inverse of the larger of the
squared standard errors (see also Stocking & Lord, 1983). Stocking and Lord (1983)
further scaled this weight by the sum of weights across all items. For the graded
response model, the scaled weight for the location parameter estimate for item j and
category k, $W_{jk}$, is defined as

$$W_{jk} = \frac{[\max\{SE^2(b_{jk1}), SE^2(b_{jk2})\}]^{-1}}{\sum_{j=1}^{n}\sum_{k=1}^{m_j-1}[\max\{SE^2(b_{jk1}), SE^2(b_{jk2})\}]^{-1}}. \tag{16}$$

The weighted estimates of the location parameters are obtained as

$$b'_{jk1} = W_{jk} b_{jk1} \tag{17}$$

and

$$b'_{jk2} = W_{jk} b_{jk2}. \tag{18}$$

Then, from the relationship

$$\bar{b}'_1 = A \bar{b}'_2 + B \tag{19}$$

and

$$S(b'_1) = A\,S(b'_2), \tag{20}$$

the coefficients A and B are obtained as

$$A = S(b'_1) / S(b'_2) \tag{21}$$

and

$$B = \bar{b}'_1 - A \bar{b}'_2. \tag{22}$$



Two additional refinements to the weighted mean and sigma method, of possible
interest although not treated in the present study, are the robust mean and sigma
method described by Bejar and Wingersky (1981) and the iterative mean and sigma
method by Stocking and Lord (1983). The objective of these methods is to further
decrease the impact of deviant item location parameter estimates on the linking
transformation using biweights described by Mosteller and Tukey (1977).
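
The weighted mean and sigma method described above can be sketched in Python as follows. This is a hedged illustration only: it assumes that each weighted estimate is the product of the scaled weight and the location estimate, and that A and B are then computed from the means and standard deviations of the weighted values. The function name, standard errors, and locations are hypothetical.

```python
from statistics import mean, pstdev

def weighted_mean_sigma(b1, se1, b2, se2):
    """Weighted mean and sigma coefficients (assumed form: weighted value = W * b)."""
    # Raw weight: inverse of the larger squared standard error for each boundary.
    raw = [1.0 / max(s1 ** 2, s2 ** 2) for s1, s2 in zip(se1, se2)]
    total = sum(raw)
    W = [w / total for w in raw]                 # scaled weights sum to one
    wb1 = [w * b for w, b in zip(W, b1)]
    wb2 = [w * b for w, b in zip(W, b2)]
    A = pstdev(wb1) / pstdev(wb2)
    B = mean(wb1) - A * mean(wb2)
    return A, B, W

# Hypothetical estimates; b1 differs from b2 only by a known slope of 1.1.
b2 = [-1.3, -0.5, 0.2, 1.0, -0.9, -0.1, 0.6, 1.4]
b1 = [1.1 * b for b in b2]
se1 = [0.20, 0.12, 0.10, 0.18, 0.15, 0.09, 0.11, 0.22]
se2 = [0.18, 0.14, 0.09, 0.21, 0.13, 0.10, 0.12, 0.19]

A, B, W = weighted_mean_sigma(b1, se1, b2, se2)
print(A, B)
```

Poorly estimated boundaries (large standard errors) receive small weights and so contribute little to either coefficient. Under this assumed form, the intercept is computed from the weighted values and is therefore expressed on their scale.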

Test Characteristic Curve Method for Graded Response Model. Baker 
(1992) extended the test characteristic curve method of Stocking and Lord (1983) 
to the graded response model. Baker's technique for obtaining the two equating 
coefficients was based on the minimization of the quadratic loss function 



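$$F = \frac{1}{N}\sum_{i=1}^{N}\left(T_{i1} - T^{*}_{i2}\right)^{2}, \tag{23}$$

where N is an arbitrary number of points along the first ability metric, and $T_{i1}$ and $T^{*}_{i2}$
are the true scores for the first and second groups, respectively, defined as

$$T_{i1} = \sum_{j=1}^{n}\sum_{k=1}^{m_j} u_{jk} P_{jk1}(\theta_i) \tag{24}$$

and

$$T^{*}_{i2} = \sum_{j=1}^{n}\sum_{k=1}^{m_j} u_{jk} P^{*}_{jk2}(\theta_i), \tag{25}$$

where $u_{jk}$ is the weight allocated to response category k for item j, and $P^{*}_{jk2}$ is the
OCC evaluated with the transformed item parameters from the second calibration. Typically,
although not necessarily, the weight is the same as the integer index of the category.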

The task is to find the values of A and B which minimize the quadratic loss 
function in Equation (23). In the present study, the characteristic curve method 
for the graded response model was used as implemented in the computer program 
EQUATE2 (Baker, 1993). 
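
To make the minimization concrete, the following Python sketch evaluates a quadratic loss of this kind for graded items and minimizes it with a simple derivative-free compass search. It is not the EQUATE2 implementation: the search routine, the function names, the evenly spaced ability points, and the item parameters are all assumptions made for illustration, and category weights are taken to be the integer category indices.

```python
import math

def bcc(theta, a, b):
    """Boundary characteristic curve (Eq. 4), written to avoid overflow."""
    z = a * (theta - b)
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    ez = math.exp(z)
    return ez / (1.0 + ez)

def true_score(theta, items):
    """Expected test score with category weights u_jk = k (k = 1, ..., m_j)."""
    total = 0.0
    for a, bs in items:
        stars = [1.0] + [bcc(theta, a, b) for b in bs] + [0.0]
        total += sum((k + 1) * (stars[k] - stars[k + 1]) for k in range(len(bs) + 1))
    return total

def tcc_loss(A, B, items1, items2, thetas):
    """Mean squared true-score difference after transforming calibration 2."""
    if A <= 0:
        return float("inf")                       # keep the slope positive
    items2_star = [(a / A, [A * b + B for b in bs]) for a, bs in items2]
    sq = [(true_score(t, items1) - true_score(t, items2_star)) ** 2 for t in thetas]
    return sum(sq) / len(sq)

def compass_search(f, start, step=0.5, tol=1e-6):
    """Derivative-free search: try +/- step on each coordinate, halve step when stuck."""
    x, fx = list(start), f(*start)
    while step > tol:
        improved = False
        for i in range(len(x)):
            for delta in (step, -step):
                trial = list(x)
                trial[i] += delta
                f_trial = f(*trial)
                if f_trial < fx:
                    x, fx, improved = trial, f_trial, True
        if not improved:
            step /= 2.0
    return x, fx

# Hypothetical first-calibration items: (discrimination, ordered locations).
items1 = [(1.2, [-1.0, -0.3, 0.4, 1.1]),
          (1.6, [-0.8, 0.0, 0.6, 1.4]),
          (1.9, [-1.5, -0.5, 0.2, 0.9])]
# Second calibration constructed so that the true coefficients are A = 1.2, B = -0.3.
true_A, true_B = 1.2, -0.3
items2 = [(a * true_A, [(b - true_B) / true_A for b in bs]) for a, bs in items1]
thetas = [-4.0 + 0.2 * i for i in range(41)]      # N = 41 points on the first metric

(A_hat, B_hat), loss = compass_search(
    lambda A, B: tcc_loss(A, B, items1, items2, thetas), (1.0, 0.0))
print(A_hat, B_hat, loss)
```

Starting from A = 1 and B = 0 is a natural choice when the two metrics are expected to be close. A gradient-based optimizer would typically converge faster, but the compass search keeps the sketch dependency-free.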

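Minimum Chi-Square for Graded Response Model. Kim and Cohen
(in press) extended the minimum chi-square method of Divgi (1985) to the graded
response model. The method is based on minimization of the quadratic function

$$\chi^{2} = \sum_{j=1}^{n} \chi^{2}_{jm_j} = \sum_{j=1}^{n} (\xi_{jm_j1} - \xi^{*}_{jm_j2})' (\Sigma_{jm_j1} + \Sigma^{*}_{jm_j2})^{-1} (\xi_{jm_j1} - \xi^{*}_{jm_j2}), \tag{26}$$

where

$$\xi_{jm_j1} = (a_{j1}, b_{j11}, \ldots, b_{jk1}, \ldots, b_{j(m_j-1)1})' \tag{27}$$

and

$$\xi^{*}_{jm_j2} = (a^{*}_{j2}, b^{*}_{j12}, \ldots, b^{*}_{jk2}, \ldots, b^{*}_{j(m_j-1)2})', \tag{28}$$

and where $\Sigma_{jm_j1}$ is the estimated variance-covariance matrix of $\xi_{jm_j1}$ and $\Sigma^{*}_{jm_j2}$ is the
transformed estimated variance-covariance matrix of $\xi^{*}_{jm_j2}$. The equating coefficients
A and B are found by minimizing this function, differentiating with respect to A and B.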
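
A minimal Python sketch of the criterion is given below for the special case in which only the diagonal (variance) terms of each covariance matrix are used; ignoring the off-diagonal terms is an assumption of this sketch. The transformed variances follow from the linear transformation for fixed A and B (the variance of a/A is var(a)/A^2 and the variance of Ab + B is A^2 var(b)). The function name, data layout, and numbers are hypothetical, and the sketch only evaluates the function; any numerical optimizer could then be used to minimize it.

```python
def chi_square_diag(A, B, items1, items2):
    """Diagonal-only quadratic criterion summed over items.

    Each item is a dict with keys 'a', 'var_a', 'b' (list), and 'var_b' (list).
    """
    total = 0.0
    for it1, it2 in zip(items1, items2):
        # Discrimination: a* = a2 / A, with variance var(a2) / A^2.
        a_star = it2["a"] / A
        total += (it1["a"] - a_star) ** 2 / (it1["var_a"] + it2["var_a"] / A ** 2)
        # Locations: b* = A * b2 + B, with variance A^2 * var(b2).
        for b1, v1, b2, v2 in zip(it1["b"], it1["var_b"], it2["b"], it2["var_b"]):
            b_star = A * b2 + B
            total += (b1 - b_star) ** 2 / (v1 + A ** 2 * v2)
    return total

# Hypothetical estimates for two items; calibration 2 built with A = 1.1, B = 0.2.
true_A, true_B = 1.1, 0.2
items1 = [
    {"a": 1.4, "var_a": 0.010, "b": [-0.9, -0.1, 0.5, 1.2], "var_b": [0.02, 0.01, 0.01, 0.03]},
    {"a": 1.8, "var_a": 0.015, "b": [-1.2, -0.4, 0.3, 0.9], "var_b": [0.03, 0.01, 0.01, 0.02]},
]
items2 = [
    {"a": it["a"] * true_A, "var_a": it["var_a"],
     "b": [(b - true_B) / true_A for b in it["b"]], "var_b": it["var_b"]}
    for it in items1
]

print(chi_square_diag(true_A, true_B, items1, items2))   # near zero at the true values
print(chi_square_diag(1.0, 0.0, items1, items2))         # larger when untransformed
```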
Methods

Data Generation. Data for this study were generated for two test lengths, 10 
and 30 items, and two sample sizes, 300 and 1,000 examinees, using the computer 
program GENIRV (Baker, 1986). The two factors, test length and sample size, were 
completely crossed to yield four conditions. All items had five categories. Each 
test was replicated five times by changing the random number seed. Generating 
parameters for the underlying ability and item difficulty distributions were both 
normal (0, 1). The underlying item discrimination parameters were generated 
uniformly over the interval from 1.0 to 2.0. All replication data sets for each of 
the test lengths had the same set of underlying parameters.
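
The design can be illustrated with a small Python simulation. This is a sketch under stated assumptions, not a description of GENIRV: the way location parameters are drawn and ordered within an item, the function names, and the seeds are all hypothetical. Item parameters are generated once and reused, and only the seed for abilities and responses changes between replications.

```python
import math
import random

def bcc(theta, a, b):
    """Boundary characteristic curve in logistic form."""
    return 1.0 / (1.0 + math.exp(-a * (theta - b)))

def make_items(n_items, n_cats=5, seed=0):
    """Draw a ~ U(1, 2) and n_cats - 1 ordered N(0, 1) locations per item (assumed scheme)."""
    rng = random.Random(seed)
    return [(rng.uniform(1.0, 2.0),
             sorted(rng.gauss(0.0, 1.0) for _ in range(n_cats - 1)))
            for _ in range(n_items)]

def simulate_responses(items, n_persons, seed):
    """Simulate graded responses (categories 1..m) for N(0, 1) abilities."""
    rng = random.Random(seed)
    data = []
    for _ in range(n_persons):
        theta = rng.gauss(0.0, 1.0)
        row = []
        for a, bs in items:
            u = rng.random()
            # Category = 1 + number of boundaries exceeded, since P(category > k) = BCC_k.
            row.append(1 + sum(u < bcc(theta, a, b) for b in bs))
        data.append(row)
    return data

items = make_items(n_items=10, seed=7)            # same parameters for every replication
replications = [simulate_responses(items, n_persons=300, seed=s) for s in range(1, 6)]
print(len(replications), len(replications[0]), replications[0][0])
```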

Item Parameter Estimation. Marginal maximum likelihood item parameter 
estimates were obtained via the computer program MULTILOG (Thissen, 1991). 
Estimates from each replication were transformed to the metric of the generated data 
sets using each of the five equating methods. Since the equating task is that of a 
recovery study, the theoretical values for the linear equating coefficients are known 
a priori and are A = 1.0 and B = 0.0.

Results 

In this study, the parameter estimates were first transformed to the underlying 
metric using each of the five equating methods. This yielded five different A and 
B coefficients for each data set. Next, the recovery of the underlying parameters 
was evaluated using root mean square differences (RMSDs) between the transformed
estimates and the underlying parameters. The smaller the RMSDs, the better the 
equating method. In addition, correlations between the estimates and the generating 
parameters were also computed. (Note: Correlations are scale-free meaning that 
equating is not required.) 
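
The two recovery measures can be sketched in Python as follows. The function names and the numbers are hypothetical; the example also checks that the correlation is unchanged by a linear transformation with positive slope, which is why it does not depend on the equating method.

```python
import math

def rmsd(estimates, generating):
    """Root mean square difference between transformed estimates and generating values."""
    n = len(estimates)
    return math.sqrt(sum((e - g) ** 2 for e, g in zip(estimates, generating)) / n)

def pearson(x, y):
    """Pearson product-moment correlation."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((xi - mx) * (yi - my) for xi, yi in zip(x, y))
    sxx = sum((xi - mx) ** 2 for xi in x)
    syy = sum((yi - my) ** 2 for yi in y)
    return sxy / math.sqrt(sxx * syy)

# Hypothetical generating locations and estimates.
generating = [-1.0, 0.0, 0.5, 1.2]
estimates = [-0.9, 0.1, 0.4, 1.3]

# Hypothetical equating coefficients applied to the estimates.
A, B = 0.95, 0.02
transformed = [A * b + B for b in estimates]

print(rmsd(transformed, generating))
print(pearson(estimates, generating), pearson(transformed, generating))
```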

Equating Coefficients. Equating coefficients obtained from each of the five 
equating methods are given in Tables 1 and 2 for each replication for 300 examinee
samples for the 10- and 30-item tests, respectively. Results for the large sample,
1,000 examinee conditions are given in Tables 3 and 4 for the 10- and 30-item tests,
respectively.

Insert Tables 1, 2, 3, and 4 about here 

Across all data sets, differences in A coefficients were quite small, occasionally
arising in the second decimal place but more often in the third or fourth. Differences of
this magnitude are essentially zero. In the small sample condition with 300 examinees, 
differences among A values tended to be very small and not meaningfully different 
from 1, the theoretically expected value for a recovery study. A values were basically 
the same for all five equating methods in all four test length by sample size conditions. 
Differences which did occur were primarily in the second through fourth decimal places 
and, consequently, were essentially zero. There was a tendency for A values to differ 
less from 1.0 for the longer 30-item test but none of these differences was greater than 
.05. No consistent differences were observed among the five equating methods. 

Differences in B coefficients also were very small, some occurring in the second
decimal place but more in the third or fourth. As noted for the A coefficients, 
differences of this magnitude are essentially zero. All of the B coefficients were 
essentially zero, the theoretically expected value for this recovery study. There was a 
slight tendency for B values to be closer to zero for the large sample and longer test 
condition. No consistent differences were observed among the five equating methods. 

Recovery of Underlying Parameters. Recovery of the underlying parameters 
with each method was evaluated with root mean square differences (RMSD) between 
the transformed estimates and the generating parameters. RMSDs for the 300 
examinee samples are given in Tables 5 and 6 for the 10- and 30-item tests,
respectively. RMSDs for the 1,000 examinee samples are given in Tables 7 and 8
for the 10- and 30-item tests, respectively. Mean values for RMSDs for both types of
item parameters are given for the five equating methods at the bottom of each table.
Correlations between estimates and the generating parameters are also given in these 
tables. 



Insert Tables 5, 6, 7, and 8 about here 

Recovery of discrimination parameters was good in all data sets. Correlations 
between estimated discrimination and generating parameters ranged from .731 to .954 
in the 300 examinee samples and from .912 to .961 in the 1,000 examinee samples. All 
correlations indicate good recovery. Mean RMSD values ranged from .1231 to .1494
for the small sample conditions and .0789 to .0903 in the large sample conditions. In 
addition, smaller RMSDs were observed within each test length for the large sample 
conditions. The RMSDs for discrimination also indicate good recovery. No differences 
in recovery were observed among equating methods. 

Recovery of location parameters was good under each of the conditions simulated. 
Correlations with underlying parameters were nearly perfect, ranging from .992 to 
.999. RMSDs in the 300 examinee test conditions were relatively small (average 
RMSDs ranged from .1274 to .1449) but were about twice as large as those in the
1,000 examinee conditions (average RMSDs ranged from .0631 to .0744). Values
of RMSDs indicated excellent recovery of location parameters. No differences were 
observed among the five equating methods. 

RMSDs showed very slight differences among individual data sets within each 
of the test length by sample size conditions. For average discrimination or location 
parameters, however, no meaningful differences were found among the five equating
methods under any of the simulated conditions. The differences that were observed
were confined to the second through fourth decimal places and were essentially
non-existent.

Discussion 

The comparability of IRT item parameter estimates across different tests measuring
the same underlying trait is an important matter for test developers and researchers
since all decisions about examinees are derived from these estimates. Efforts
to reduce errors in transformation of estimates obtained in different groups are
important concerns. In the present paper, we compared five methods for linking item
parameter estimates for graded response models. These five methods are among the
more commonly used for transforming item and ability parameter estimates from one
metric to another. The comparisons were based on measures of similarity to the
generating parameters of the item parameter estimates obtained following transformation
via each of the methods to the underlying metric.

Differences in equating coefficients were quite small under all sample size by test 
length conditions. In the small sample conditions, there was a slight tendency for A
and B coefficients to be closer to the theoretically expected values for the 30-item
tests. These differences, however, occurred only in the second or third decimal places 
and, as such, were essentially non-existent. In the large sample conditions, similar 
lack of deviations from values of 1.0 for A and 0 for B were found. 

Results under the conditions simulated indicated that recovery was good for all 
conditions. Recovery was slightly better for the long test and the large sample 
conditions but differences among all the simulated conditions actually were quite 
small. Further, essentially no differences were observed among linking methods. 
These results are consistent with previous research in that, when the underlying
ability and item difficulty distributions match, estimation of location parameters is
optimal and of discrimination parameters tends to be generally good. Recovery of 
underlying parameters under such conditions also tends to be very good so that 
differences among equating methods should be quite minimal. 

One of the equating methods compared in this study, the minimum chi- 
square method, required use of the off-diagonal covariance terms for each item.
Unfortunately, currently available computer programs do not provide values of these
off-diagonal terms so they were not available for the present study. The chi-square (or
the quadratic function) that was minimized was obtained based only on the diagonal
terms of the variance-covariance matrix. Thus, the variance-covariance matrix in
Equation 30 is a diagonal matrix. The resulting statistic, and the one used in the present
study, is related to Pearson's (1926) coefficient of racial likeness (CRL). It has been
found to be highly correlated with the Mahalanobis $D^2$ (i.e., $\chi^{2}_{jm_j}$ in Equation 26)
and recommended as a replacement for $D^2$ because of its computational ease (Gower,
1972; Mardia, 1977; Penrose, 1954).

Finally, given the results of this study, one should feel relatively comfortable using 
any of the five equating methods when ability and item location distributions are well- 
matched. That is, when item parameters are estimated under optimal conditions such
as used in the present study, little if any real differences appear to be present among 
these equating methods. Additional research on situations in which item parameters 
are less well-estimated would be important in further developing our understanding 
of the effectiveness of each of these equating methods. Under the present conditions, 
however, the results do not indicate any reason for selecting one method over the 
other. Neither is there any theoretical rationale for selecting one method over the
other. The simplest method to use is clearly the LH method. The minimum chi-square 
method has some advantage in ease of implementation over the test characteristic 
curve method but both methods are far more computationally intensive than the LH
method. 

Graded response models are particularly appropriate for constructed response 
item formats such as found in many types of performance tests. Development and 
comparison of the procedures for equating graded response items as was done in this 
study should provide some useful information toward solving some of the equating 
problems present in performance and constructed response types of tests. 
TABLE 1

Equating Coefficients A and B for 300-Examinee-10-Item Data Set

         Equating                         Method
Rep.*    Coefficient     LH        MS        WMS       MCS       TCC
1st      A               .9197     .9257     .9592     .9357     .9220
         B               .0659     .0665     .0021     .0829     .0725
2nd      A              1.0232     .9731    1.0132    1.0078    1.0009
         B               .0554     .0496     .0008     .0328     .0477
3rd      A               .9599     .9190     .9244     .9465     .9352
         B              -.0252    -.0257    -.0008    -.0364    -.0294
4th      A               .9970     .9910    1.0322    1.0124     .9878
         B              -.0270    -.0264    -.0011    -.0427    -.0265
5th      A               .9778     .9631     .9901     .9819     .9706
         B               .0032     .0020    -.0003    -.0125    -.0075

*Replication.
TABLE 2

Equating Coefficients A and B for 300-Examinee-30-Item Data Set

         Equating                         Method
Rep.     Coefficient     LH        MS        WMS       MCS       TCC
1st      A               .9943     .9811    1.0173    1.0064     .9861
         B               .0198     .0191     .0001     .0147     .0218
2nd      A              1.0336     .9857    1.0402    1.^56     1.0105
         B              -.0454    -.0449    -.0004    -.0468    -.0494
3rd      A              1.0305    1.0082    1.0332    1.0280    1.0174
         B              -.0362    -.0362    -.0004    -.0499    -.0399
4th      A              1.0561    1.0437    1.0606    1.0569    1.0461
         B              -.0028    -.0030     .0001     .0069    -.0028
5th      A              1.0185     .9928    1.0279    1.0185    1.0018
         B              -.0147    -.0151    -.0001    -.0067    -.0122
TABLE 3

Equating Coefficients A and B for 1000-Examinee-10-Item Data Sets

         Equating                         Method
Rep.     Coefficient     LH        MS        WMS       MCS       TCC
1st      A               ,9-       .9580     .9575     .9628     .9627
         B              -.0295    -.0302    -.0006    -.0260    -.0286
2nd      A               .9836     .9654     .8792     .9779     .9681
         B              -.0146    -.0153    -.0002    -.0085    -.0160
3rd      A               .9504     .9424     .9600     .9511     .9470
         B              -.0266    -.0266    -.0006    -.0261    -.0243
4th      A               .9785    1.0025    1.0079     .9961     .9917
         B               .0013     .0026    -.0001    -.0049     .0022
5th      A               .9780     .9706     .9734     .9722     .9658
         B               .0103     .0108     .0002     .0074     .0096
TABLE 4

Equating Coefficients A and B for 1000-Examinee-30-Item Data Sets

         Equating                         Method
Rep.     Coefficient     LH        MS        WMS       MCS       TCC
1st      A              1.0165    1.0103    1.0253    1.0172    1.0155
         B              -.0165    -.0165    -.0002    -.0208    -.0203
2nd      A              1.0160    1.0135    1.0272    1.0205    1.0143
         B               .0086     .0080     .0000     .0023     .0067
3rd      A               .9965     .9858    1.0091     .9955     .9905
         B              -.0105    -.0104    -.0001    -.0129    -.0113
4th      A               .9854     .9818     .9932     .9882     .9827
         B              -.0228    -.0226    -.0002    -.0182    -.0218
5th      A               .9935     .9879    1.0083     .9960     .9902
         B              -.0036    -.0037    -.0001    -.0088    -.0069
TABLE 5

Root Mean Squared Differences and Correlation for 300-Examinee-10-Item
Data Sets

                                          Method
Rep.      Parameter         LH       MS       WMS      MCS      TCC      Corr.
1st       Discrimination    .0752    .0751    .0958    .0780    .0750    .954
          Location          .1115    .1116    .1389    .1139    .1117    .995
2nd       Discrimination    .1602    .1833    .1617    .1632    .1658    .731
          Location          .1606    .1459    .1647    .1552    .1515    .992
3rd       Discrimination    .1196    .1453    .1930    .1241    .1309    .903
          Location          .1497    .1379    .1470    .1444    .1406    .993
4th       Discrimination    .1339    .1353    .1388    .1334    .1364    .881
          Location          .1276    .1270    .1401    .1318    .1268    .994
5th       Discrimination    .1268    .1305    .1272    .1266    .1281    .847
          Location          .1300    .1279    .1336    .1319    .1290    .994
Average   Discrimination    .1231    .1339    .1433    .1251    .1272    .863
          Location          .1359    .1301    .1449    .1354    .1319    .994
TABLE 6

Root Mean Squared Differences and Correlation for 300-Examinee-30-Item
Data Sets

                                          Method
Rep.      Parameter         LH       MS       WMS      MCS      TCC      Corr.
1st       Discrimination    .1234    .1273    .1246    .1228    .1254    .891
          Location          .1307    .1288    .1396    .1340    .1293    .994
2nd       Discrimination    .1550    .1772    .1545    .1564    .1616    .777
          Location          .1566    .1426    .1663    .1529    .1474    .992
3rd       Discrimination    .1296    .1374    .1292    .1300    .1331    .876
          Location          .1136    .1094    .1200    .1137    .1105    .996
4th       Discrimination    .1303    .1327    .1300    .1302    .1321    .837
          Location          .1235    .1220    .1244    .1240    .1221    .995
5th       Discrimination    .1623    .1725    .1609    .1623    .1678    .851
          Location          .1394    .1344    .1435    .1396    .1354    .993
Average   Discrimination    .1401    .1494    .1398    .1403    .1440    .847
          Location          .1328    .1274    .1388    .1328    .1289    .994
TABLE 7

Root Mean Squared Differences and Correlation for 1000-Examinee-10-Item
Data Sets

                                          Method
Rep.      Parameter         LH       MS       WMS      MCS      TCC      Corr.
1st       Discrimination    .0663    .0689    .0692    .0668    .0668    .959
          Location          .0444    .0420    .0513    .0427    .0425    .999
2nd       Discrimination    .0956    .1038    .0968    .0972    .1020    .955
          Location          .0865    .0820    .0855    .0841    .0822    .997
3rd       Discrimination    .0966    .0985    .0967    .0965    .0972    .923
          Location          .0674    .0664    .0749    .0675    .0669    .998
4th       Discrimination    .0895    .0972    .1007    .0938    .0920    .912
          Location          .0708    .0661    .0667    .0667    .0669    .998
5th       Discrimination    .0830    .0831    .0827    .0828    .0844    .961
          Location          .0592    .0591    .0603    .0593    .0593    .999
Average   Discrimination    .0860    .0903    .0892    .0874    .0885    .942
          Location          .0655    .0631    .0677    .0641    .0636    .998
TABLE 8

Root Mean Squared Differences and Correlation for 1000-Examinee-30-Item
Data Sets

                                          Method
Rep.      Parameter         LH       MS       WMS      MCS      TCC      Corr.
1st       Discrimination    .0829    .0835    .0839    .0829    .0829    .923
          Location          .0695    .0690    .0734    .0698    .0695    .998
2nd       Discrimination    .0792    .0795    .0798    .0791    .0794    .943
          Location          .0713    .0712    .0739    .0721    .0712    .998
3rd       Discrimination    .0776    .0801    .0792    .0777    .0786    .937
          Location          .0734    .0719    .0784    .0732    .0723    .998
4th       Discrimination    .0772    .0777    .0775    .0771    .0776    .943
          Location          .0771    .0708    .0759    .0716    .0709    .998
5th       Discrimination    .0779    .0788    .0800    .0778    .0783    .939
          Location          .0676    .0671    .0702    .0683    .0673    .998
Average   Discrimination    .0790    .0799    .0801    .0781    .0794    .937
          Location          .0655    .0631    .0677    .0641    .0636    .998